Progress Report
      2010.03.25
Outline

Discriminatively Trained, Multiscale,
Deformable Part Model

Object Detection with Discriminatively
Trained Part Based Models

Work to do
A Discriminatively Trained,
Multiscale, Deformable Part Model,
              CVPR’08
Part-Based Model

  Template: root filter (coarse resolution), part filters (finer resolution),
  and deformation models.

  Each component has a root filter F0 and n part models (Fi, vi, di).
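A minimal data-structure sketch may make the notation concrete: one root filter F0 plus n part models (Fi, vi, di). This is not the authors' code; the class and field names below are my own.

```python
# Sketch of the star-structured model: a coarse root filter plus n part models.
from dataclasses import dataclass
from typing import List, Tuple
import numpy as np

@dataclass
class PartModel:
    filter: np.ndarray          # Fi: part filter over fine-resolution HOG cells
    anchor: Tuple[int, int]     # vi: anchor position relative to the root
    deformation: np.ndarray     # di: 4 coefficients of the quadratic deformation cost

@dataclass
class StarModel:
    root_filter: np.ndarray     # F0: coarse-resolution root filter
    parts: List[PartModel]      # n part models (Fi, vi, di)
    bias: float = 0.0           # b: makes scores of mixture components comparable
```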
Feature Pyramid

  Object hypothesis
      z = (p0, ..., pn)
      p0: location of the root
      p1, ..., pn: locations of the parts

  The score is the sum of the filter scores minus the deformation costs.

  Image pyramid → HOG feature pyramid

  The multiscale model captures features at two resolutions.
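A rough sketch of the pyramid idea, assuming a much simplified HOG-like cell histogram in place of the real descriptor: the image is repeatedly downscaled and cell features are computed per level, so the root filter can be scored on a coarse level and the part filters on a level of twice the resolution. `hog_cells` and `feature_pyramid` are illustrative names, not the authors' implementation.

```python
import numpy as np

def hog_cells(gray, cell=8, bins=9):
    """Unnormalized per-cell orientation histograms (a stand-in for real HOG)."""
    gy, gx = np.gradient(gray.astype(float))
    mag = np.hypot(gx, gy)
    ori = (np.rad2deg(np.arctan2(gy, gx)) % 180.0) / 180.0 * bins
    h, w = gray.shape
    H = np.zeros((h // cell, w // cell, bins))
    for i in range(H.shape[0]):
        for j in range(H.shape[1]):
            m = mag[i*cell:(i+1)*cell, j*cell:(j+1)*cell].ravel()
            o = ori[i*cell:(i+1)*cell, j*cell:(j+1)*cell].astype(int).ravel() % bins
            np.add.at(H[i, j], o, m)   # accumulate gradient magnitude per orientation bin
    return H

def feature_pyramid(gray, levels=10, scale_step=2 ** (1 / 5)):
    """Feature pyramid: level k is the image downscaled by scale_step**k."""
    pyramid = []
    for k in range(levels):
        s = 1.0 / scale_step ** k
        h, w = int(gray.shape[0] * s), int(gray.shape[1] * s)
        if min(h, w) < 16:
            break
        # crude nearest-neighbour resize keeps the sketch dependency-free
        rows = (np.arange(h) / s).astype(int)
        cols = (np.arange(w) / s).astype(int)
        pyramid.append(hog_cells(gray[rows][:, cols]))
    return pyramid
```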
Score Function

      score(p0, ..., pn) = Σ_{i=0..n} Fi · φ(H, pi) − Σ_{i=1..n} di · φd(dxi, dyi) + b
                                ("data cost")              ("spatial prior")

      φd(dx, dy) = (dx, dy, dx², dy²) are deformation features.                   (4)

  Note that if di = (0, 0, 1, 1) the deformation cost for the i-th part is the
  squared distance between its actual position and its anchor position relative
  to the root. In general the deformation cost is an arbitrary separable
  quadratic function of the displacements.

  The bias term is introduced in the score to make the scores of multiple models
  comparable when we combine them into a mixture model.

  The score of a hypothesis z can be expressed in terms of a dot product,
  β · ψ(H, z), between a vector of model parameters β and a vector ψ(H, z):

      score(z) = β · ψ(H, z)

      β = (F0, ..., Fn, d1, ..., dn, b)                                            (5)

      ψ(H, z) = (φ(H, p0), ..., φ(H, pn), −φd(dx1, dy1), ..., −φd(dxn, dyn), 1)    (6)

  β is a concatenation of the filters and deformation parameters; ψ(H, z) is a
  concatenation of HOG features and part displacement features.

  This illustrates a connection between our models and linear classifiers. We use
  this relationship for learning the model parameters with the latent SVM
  framework.
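A hedged sketch of the score above, using the StarModel classes from the earlier sketch: filter responses summed over root and parts, minus the quadratic deformation costs, plus the bias. The pyramid-indexing convention (parts live on a level of twice the root's resolution) is an assumption of this illustration, not a faithful reproduction of the released code.

```python
import numpy as np

def deformation_features(dx, dy):
    # phi_d(dx, dy) = (dx, dy, dx^2, dy^2)
    return np.array([dx, dy, dx * dx, dy * dy], dtype=float)

def filter_response(feat_map, filt, y, x):
    """Dot product of a filter with the feature subwindow anchored at (y, x).
    Assumes the placement keeps the filter entirely inside the map."""
    fh, fw, _ = filt.shape
    window = feat_map[y:y + fh, x:x + fw, :]
    return float(np.sum(window * filt))

def hypothesis_score(model, root_map, part_map, z):
    """z = (p0, ..., pn): root location on the coarse level, part locations
    on the finer level (twice the resolution of the root level)."""
    (y0, x0), part_locs = z[0], z[1:]
    score = filter_response(root_map, model.root_filter, y0, x0) + model.bias
    for part, (yi, xi) in zip(model.parts, part_locs):
        ay, ax = 2 * y0 + part.anchor[0], 2 * x0 + part.anchor[1]   # anchor of part i
        dx, dy = xi - ax, yi - ay                                   # displacement from anchor
        score += filter_response(part_map, part.filter, yi, xi)
        score -= float(np.dot(part.deformation, deformation_features(dx, dy)))
    return score
```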
Matching

Find the Best Hypothesis (matching results)
 • Define an overall score for each root location
      - Based on best placement of parts
           score(p0) = max_{p1,...,pn} score(p0, ..., pn)



  •   High scoring root locations define detections

      - “sliding window approach”
  •   Efficient computation: dynamic programming +
      generalized distance transforms (max-convolution)
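The sketch below illustrates the distance-transform step named in the last bullet: for one part, turn its filter-response map into "best response minus deformation cost" for every possible anchor, i.e. a max-convolution. This brute-force version is quadratic per row/column for clarity; the paper uses a linear-time lower-envelope algorithm. Function names and the coefficient layout are my own assumptions.

```python
import numpy as np

def max_convolve_1d(f, d_lin, d_quad):
    """g[a] = max_p ( f[p] - d_lin*(p - a) - d_quad*(p - a)**2 )."""
    n = len(f)
    p = np.arange(n)
    g = np.empty(n)
    for a in range(n):
        disp = p - a                      # displacement of the part from its anchor
        g[a] = np.max(f - d_lin * disp - d_quad * disp * disp)
    return g

def transform_response(response, d):
    """Separable 2-D max-convolution; d multiplies the features (dx, dy, dx^2, dy^2)."""
    cx, cy, cxx, cyy = d
    out = np.apply_along_axis(max_convolve_1d, 1, response, cx, cxx)   # along x (columns)
    out = np.apply_along_axis(max_convolve_1d, 0, out, cy, cyy)        # along y (rows)
    return out
```

Adding each part's transformed response (shifted into the root's coordinate frame) to the root-filter response then yields score(p0) for every root location in one pass, which is the dynamic-programming step above.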
Semi-convexity

  Latent SVM (MI-SVM)

      Classifiers that score an example x using

          fβ(x) = max_{z ∈ Z(x)} β · Φ(x, z)

      β are model parameters; z are latent values.

      Training data D = ((x1, y1), ..., (xn, yn)), yi ∈ {−1, 1}.
      We would like to find β such that yi fβ(xi) > 0.

      Minimize

          LD(β) = (1/2) ||β||² + C Σ_{i=1..n} max(0, 1 − yi fβ(xi))

  Semi-convexity

      • The maximum of convex functions is convex, so fβ(x) is convex in β.
      • max(0, 1 − yi fβ(xi)) is convex for negative examples.
      • LD(β) is convex if the latent values for the positive examples are fixed.
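A hedged sketch of the alternation this slide implies: with the latent values of the positive examples fixed, LD(β) is convex, so training alternates between relabeling the positives with their best latent value and a convex update. Plain subgradient descent stands in for the paper's actual procedure (which additionally mines hard negative examples); all names here are illustrative.

```python
import numpy as np

def best_latent_feature(beta, candidates):
    """candidates: array (num_latent_choices, dim) of Phi(x, z); pick argmax_z beta . Phi(x, z)."""
    return candidates[np.argmax(candidates @ beta)]

def train_latent_svm(examples, labels, dim, C=0.01, outer_iters=5, sgd_iters=500, lr=1e-3):
    """examples[i]: array (num_latent_choices_i, dim); labels[i] in {-1, +1}."""
    beta = np.zeros(dim)
    for _ in range(outer_iters):
        # Step 1: fix the latent value of each positive example under the current beta.
        fixed = [best_latent_feature(beta, cands) if y == 1 else None
                 for cands, y in zip(examples, labels)]
        # Step 2: the objective is now convex; take plain subgradient steps on it.
        for _ in range(sgd_iters):
            grad = beta.copy()                        # gradient of (1/2) ||beta||^2
            for cands, y, phi_fixed in zip(examples, labels, fixed):
                # negatives keep maximizing over z (this is the semi-convex part)
                phi = phi_fixed if y == 1 else best_latent_feature(beta, cands)
                if y * (beta @ phi) < 1.0:            # hinge loss is active
                    grad -= C * y * phi
            beta -= lr * grad
    return beta
```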
Object Detection with
Discriminatively Trained Part Based
         Models, PAMI’09
Modifications

  Optimization function

  Lower-dimensional but more informative features

  Bounding box prediction

  Contextual information
HOG with PCA
  The first 11 eigenvectors capture almost all of the information.

  [Figure 6, PAMI'09] PCA of HOG features. Each eigenvector is displayed as a
  4-by-9 matrix so that each row corresponds to one normalization factor and
  each column to one orientation bin. The eigenvalues are displayed on top of
  the eigenvectors; they fall from 0.45617 for the first component to 0.00063
  and below after the eleventh. The linear subspace spanned by the top 11
  eigenvectors captures essentially all of the information in a feature vector.
  Note how all of the top eigenvectors are either constant along each column or
  along each row of the matrix representation.
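A small sketch of the dimensionality-reduction idea on this slide: compute the PCA of a sample of 36-dimensional HOG cell descriptors and project each cell onto the top eigenvectors. Purely illustrative; the paper goes on to define an analytic lower-dimensional feature motivated by the row/column structure of these eigenvectors, so explicit PCA projection is only one way to exploit the observation.

```python
import numpy as np

def pca_projection(cells, k=11):
    """cells: array (num_cells, 36) of HOG cell descriptors; returns mean and top-k basis."""
    mean = cells.mean(axis=0)
    centered = cells - mean
    cov = centered.T @ centered / len(cells)
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]
    basis = eigvecs[:, order]                      # (36, k) top-k eigenvectors
    return mean, basis

def reduce_cells(cells, mean, basis):
    """Project every cell onto the k-dimensional subspace."""
    return (cells - mean) @ basis                  # (num_cells, k)
```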
Post-Processing

  Bounding box prediction: learn a regression model to predict the bounding box
  coordinates.

  Contextual rescoring: re-score each detection window using the scores of all
  categories' detection windows in the image.

7.3 Contextual Information (PAMI'09)

  We have implemented a simple procedure to rescore detections using contextual
  information.

  Let (D1, ..., Dk) be a set of detections obtained using k different models
  (for different object categories) in an image I. Each detection (B, s) ∈ Di
  is defined by a bounding box B = (x1, y1, x2, y2) and a score s. We define the
  context of I in terms of a k-dimensional vector c(I) = (σ(s1), ..., σ(sk)),
  where si is the score of the highest-scoring detection in Di and
  σ(x) = 1/(1 + exp(−2x)) is a logistic function for renormalizing the scores.

  To rescore a detection (B, s) in an image I we build a 25-dimensional feature
  vector with the original score of the detection, the top-left and bottom-right
  bounding box coordinates, and the image context,

      g = (σ(s), x1, y1, x2, y2, c(I)).                                      (30)

  The coordinates x1, y1, x2, y2 ∈ [0, 1] are normalized by the width and height
  of the image. A category-specific classifier scores this vector to obtain a
  new detection score.
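A small sketch, under the assumption of k = 20 PASCAL categories (so 1 + 4 + 20 = 25 entries), of how the rescoring vector g in Eq. (30) is assembled; the category-specific classifier that consumes g is not shown.

```python
import numpy as np

def sigma(x):
    """Logistic renormalization from the paper: sigma(x) = 1 / (1 + exp(-2x))."""
    return 1.0 / (1.0 + np.exp(-2.0 * x))

def image_context(best_score_per_category):
    """c(I) = (sigma(s1), ..., sigma(sk)), si = score of the best detection of category i."""
    return sigma(np.asarray(best_score_per_category, dtype=float))

def rescoring_feature(score, box, image_size, context):
    """g = (sigma(s), x1, y1, x2, y2, c(I)) with box corners normalized to [0, 1]."""
    x1, y1, x2, y2 = box
    w, h = image_size
    corners = np.array([x1 / w, y1 / h, x2 / w, y2 / h])
    return np.concatenate(([sigma(score)], corners, context))
```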
PASCAL VOC 2008

Precision/Recall results on the Person category (2008)

                        09 Base   09 BB   09 Cont    08
  Average Precision      0.407    0.423    0.431    0.42
Work to Do


Cell model work modifications

Integrate other methods into cell model work

Another direction
